The data we will be working with throughout this course consist of GPS tracking data for Ya Ha Tinda elk (Cervus canadensis) from Hebblewhite et al 2008 and the Ya Ha Tinda Elk Project.
Some elk in this population are migratory (migrating to Banff National Park in the summer), others are resident in the Ya Ha Tinda area year-round, and others have demonstrated “switching” behavior between residency and migration.
These data are also available on Movebank.
Data that has been processed “smartly” will have the following features:
Compartmentalized - e.g., each step in your code/methods uses functions and the most efficient code possible
Interactive - e.g., leverage visualization and interactive tools
Generalizable - e.g., able to be applied to multiple individuals
Replicable - e.g., saving your code and data products regularly and NEVER overwriting the raw data
Well-Documented - e.g., commenting your code along the way, keeping files in organized folders, and storing methods in an external document as you go
Follow these guidelines and you will save yourself from many future data processing headaches!
Data can be brought into R in many ways:
read.csv - a Base R function to read in CSV files by calling the file path of wherever the file is stored
load - a Base R function to load in an Rda object (“R data object”)
getMovebankData - a function from the “move” package (Kranstauber et al 2023) to retrieve Movebank datasets by name
Excel loves to mess up date/time information, so be sure to check that your datetime column is formatted correctly (to include both the date AND time, with hours, minutes, & seconds, if applicable) before reading it into R.
The file path should correspond to wherever you saved your CSV file on your computer.
Note: It is helpful to first set your working directory (setwd), so that you don’t have to type out the entire file path every time.
elk_gps <- read.csv("./data/Elk_GPS_data.csv")
str(elk_gps)
## 'data.frame': 138433 obs. of 7 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ timestamp : chr "3/25/2003 19:01:00" "3/25/2003 23:01:00" "3/26/2003 1:01:00" "3/26/2003 5:00:00" ...
## $ location.long : num -115 -115 -115 -115 -115 ...
## $ location.lat : num 51.7 51.7 51.7 51.7 51.7 ...
## $ migration.stage : int 1 1 1 1 1 1 1 1 1 1 ...
## $ sensor.type : chr "gps" "gps" "gps" "gps" ...
## $ individual.local.identifier: chr "GP1" "GP1" "GP1" "GP1" ...
Rda objects are a useful way to store data. Essentially, R stores your object(s) as a compressed “.rda” file (R file type). This does not work with raster data, but will work with most other object types.
It’s also good practice to save your intermediate R objects in a “Data” folder in your R project or repository. For example, a great time to save your data would be after processing!
You can save data using the save Base R function and
then load it, using the same file path you saved it to.
When saving, don’t forget to add the file name and type at the end of the file path!
save(elk_gps, file="./data/elk_gps.rda")
load("./data/elk_gps.rda")
str(elk_gps)
## 'data.frame': 138433 obs. of 6 variables:
## $ timestamp : chr "3/25/2003 19:01:00" "3/25/2003 23:01:00" "3/26/2003 1:01:00" "3/26/2003 5:00:00" ...
## $ location.long : num -115 -115 -115 -115 -115 ...
## $ location.lat : num 51.7 51.7 51.7 51.7 51.7 ...
## $ migration.stage : int 1 1 1 1 1 1 1 1 1 1 ...
## $ sensor.type : chr "gps" "gps" "gps" "gps" ...
## $ individual.local.identifier: chr "GP1" "GP1" "GP1" "GP1" ...
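As a side note, save can bundle several objects into one .rda file, and load restores all of them under their original names. A minimal sketch using a temporary file (the object names here are made up for illustration):

```r
# Save two objects into a single .rda file, then restore them
a <- 1:3
b <- data.frame(x = letters[1:2])
f <- tempfile(fileext = ".rda")
save(a, b, file = f)

rm(a, b)   # both objects are gone from the environment...
load(f)    # ...and reappear under their original names
```

This is one reason load does not use an assignment arrow: the object names come from inside the file itself.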
Movebank is an incredibly useful (online) resource and repository for storing and accessing tracking datasets for a variety of species.
Public data can be downloaded either online or through the “move” R package. After installing the package (install.packages), we can load the package for use in our R session using the library Base R function.
library(move)
You will need to make a Movebank account and login first, using the
movebankLogin function.
mylogin <- movebankLogin(username = 'YourUsername', password = 'yourpassword')
Now you can use your login information object within the
getMovebankData function to access the study you are
interested in.
Let’s access one of the Elk Movebank Datasets.
IMPORTANT - you first need to go to the study page on Movebank, log in with your credentials, and under the “Download” tab, click “Download Data” and then agree to the license agreement. You can now download the data on the web to your computer OR use the function below to download the data.
elk_move <- getMovebankData(study = "Ya Ha Tinda elk project, Banff National Park, 2001-2023 (females)", login = mylogin, removeDuplicatedTimestamps=TRUE)
head(elk_move)
## tag_id sensor_type_id external_temperature gps_dop height_above_ellipsoid
## 1 1200770822 653 NA NA NA
## 2 1200770822 653 NA NA NA
## 3 1200770822 653 NA NA NA
## 4 1200770822 653 NA NA NA
## 5 1200770822 653 NA NA NA
## 6 1200770822 653 NA NA NA
## location_lat location_long manually_marked_outlier timestamp
## 1 52.12410 -115.8044 2001-12-13 07:01:12
## 2 52.11762 -115.8003 2001-12-13 09:01:07
## 3 52.09611 -115.8281 2001-12-14 09:01:05
## 4 52.09829 -115.8318 2001-12-14 11:00:49
## 5 52.09482 -115.8042 2001-12-14 17:02:19
## 6 52.12493 -115.8037 2001-12-14 19:01:07
## update_ts visible deployment_id event_id sensor_type
## 1 2024-04-22 14:52:38.396 true 3662883980 15155700828 GPS
## 2 2024-04-22 14:52:38.396 true 3662883980 15155702143 GPS
## 3 2024-04-22 14:52:38.396 true 3662883980 15155694994 GPS
## 4 2024-04-22 14:52:38.396 true 3662883980 15155693463 GPS
## 5 2024-04-22 14:52:38.396 true 3662883980 15155700885 GPS
## 6 2024-04-22 14:52:38.396 true 3662883980 15155700986 GPS
## tag_local_identifier
## 1 4049
## 2 4049
## 3 4049
## 4 4049
## 5 4049
## 6 4049
The getMovebankData function will result in a
“MoveStack” object, which is specially formatted for the “move” package
functions.
We can do a quick plot of the elk locations in this dataset using the
plot function on our MoveStack object.
plot(elk_move)
We can convert it to a basic R data frame object using the
as.data.frame function.
elk_df <- as.data.frame(elk_move)
str(elk_df)
## 'data.frame': 1742248 obs. of 50 variables:
## $ tag_id : num 1.2e+09 1.2e+09 1.2e+09 1.2e+09 1.2e+09 ...
## $ sensor_type_id : int 653 653 653 653 653 653 653 653 653 653 ...
## $ external_temperature : num NA NA NA NA NA NA NA NA NA NA ...
## $ gps_dop : num NA NA NA NA NA NA NA NA NA NA ...
## $ height_above_ellipsoid : num NA NA NA NA NA NA NA NA NA NA ...
## $ location_lat : num 52.1 52.1 52.1 52.1 52.1 ...
## $ location_long : num -116 -116 -116 -116 -116 ...
## $ manually_marked_outlier: chr "" "" "" "" ...
## $ timestamp : POSIXct, format: "2001-12-13 07:01:12" "2001-12-13 09:01:07" ...
## $ update_ts : chr "2024-04-22 14:52:38.396" "2024-04-22 14:52:38.396" "2024-04-22 14:52:38.396" "2024-04-22 14:52:38.396" ...
## $ visible : chr "true" "true" "true" "true" ...
## $ deployment_id : num 3.66e+09 3.66e+09 3.66e+09 3.66e+09 3.66e+09 ...
## $ event_id : num 1.52e+10 1.52e+10 1.52e+10 1.52e+10 1.52e+10 ...
## $ sensor_type : Factor w/ 1 level "GPS": 1 1 1 1 1 1 1 1 1 1 ...
## $ tag_local_identifier : chr "4049" "4049" "4049" "4049" ...
## $ location_long.1 : num -116 -116 -116 -116 -116 ...
## $ location_lat.1 : num 52.1 52.1 52.1 52.1 52.1 ...
## $ optional : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
## $ sensor : Factor w/ 1 level "GPS": 1 1 1 1 1 1 1 1 1 1 ...
## $ timestamps : POSIXct, format: "2001-12-13 07:01:12" "2001-12-13 09:01:07" ...
## $ trackId : Factor w/ 206 levels "X4049","BL201",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ birth_hatch_latitude : logi NA NA NA NA NA NA ...
## $ birth_hatch_longitude : logi NA NA NA NA NA NA ...
## $ comments : logi NA NA NA NA NA NA ...
## $ death_comments : logi NA NA NA NA NA NA ...
## $ earliest_date_born : logi NA NA NA NA NA NA ...
## $ exact_date_of_birth : logi NA NA NA NA NA NA ...
## $ group_id : logi NA NA NA NA NA NA ...
## $ individual_id : num 1.2e+09 1.2e+09 1.2e+09 1.2e+09 1.2e+09 ...
## $ latest_date_born : logi NA NA NA NA NA NA ...
## $ local_identifier : chr "4049" "4049" "4049" "4049" ...
## $ marker_id : logi NA NA NA NA NA NA ...
## $ mates : logi NA NA NA NA NA NA ...
## $ mortality_date : logi NA NA NA NA NA NA ...
## $ mortality_latitude : logi NA NA NA NA NA NA ...
## $ mortality_longitude : logi NA NA NA NA NA NA ...
## $ mortality_type : logi NA NA NA NA NA NA ...
## $ nick_name : logi NA NA NA NA NA NA ...
## $ offspring : logi NA NA NA NA NA NA ...
## $ parents : logi NA NA NA NA NA NA ...
## $ ring_id : logi NA NA NA NA NA NA ...
## $ sex : chr "f" "f" "f" "f" ...
## $ siblings : logi NA NA NA NA NA NA ...
## $ taxon_canonical_name : chr "Cervus elaphus" "Cervus elaphus" "Cervus elaphus" "Cervus elaphus" ...
## $ timestamp_start : chr "2001-12-13 07:01:12.000" "2001-12-13 07:01:12.000" "2001-12-13 07:01:12.000" "2001-12-13 07:01:12.000" ...
## $ timestamp_end : chr "2002-11-14 03:00:55.000" "2002-11-14 03:00:55.000" "2002-11-14 03:00:55.000" "2002-11-14 03:00:55.000" ...
## $ number_of_events : int 3247 3247 3247 3247 3247 3247 3247 3247 3247 3247 ...
## $ number_of_deployments : int 1 1 1 1 1 1 1 1 1 1 ...
## $ sensor_type_ids : chr "gps" "gps" "gps" "gps" ...
## $ taxon_detail : chr "ssp. canadensis" "ssp. canadensis" "ssp. canadensis" "ssp. canadensis" ...
Processing data will always be specific to your data and needs. Sometimes it can be helpful to do some processing and data cleaning outside of R (e.g., within Excel, especially for datetime information).
You may find it useful to diagram or write out your data processing needs BEFORE trying to draft your code.
R is a powerful tool for quick, efficient, and reproducible data processing and cleaning. If there is ever something you don’t know how to do in R, a quick Google search or taking a look at one of the many R resources online (e.g., R-Bloggers or Stack Overflow) will likely eventually result in a solution.
The “[]” operator can be used to grab specific columns by their number in a dataframe.
Let’s drop the 4th and 5th columns from our “elk_gps” dataset.
elk_gps <- elk_gps[, -c(4:5)]
str(elk_gps)
## 'data.frame': 138433 obs. of 4 variables:
## $ timestamp : chr "3/25/2003 19:01:00" "3/25/2003 23:01:00" "3/26/2003 1:01:00" "3/26/2003 5:00:00" ...
## $ location.long : num -115 -115 -115 -115 -115 ...
## $ location.lat : num 51.7 51.7 51.7 51.7 51.7 ...
## $ individual.local.identifier: chr "GP1" "GP1" "GP1" "GP1" ...
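The same “[]” operator also selects columns to keep (rather than drop), either by name or by position. A quick sketch with a toy data frame:

```r
# Select columns by name, or drop by negative position
df <- data.frame(a = 1:2, b = 3:4, c = 5:6)
df[, c("a", "c")]  # keep columns "a" and "c" by name
df[, -2]           # equivalent here: drop the 2nd column
```

Selecting by name is often safer than by number, since column positions can change as you process the data.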
Columns can be renamed using the names Base R function
and a vector of the names you want, as text (character format) and in
the order you want.
Let’s rename our columns to “datetime”, “lon”, “lat”, and “id”.
Note the structure of your data as well. R has many different data structures, but the main ones you will use are numeric, character (factor is similar but has levels), POSIXct, and spatial.
names(elk_gps) <- c("datetime", "lon", "lat", "id")
str(elk_gps)
## 'data.frame': 138433 obs. of 4 variables:
## $ datetime: chr "3/25/2003 19:01:00" "3/25/2003 23:01:00" "3/26/2003 1:01:00" "3/26/2003 5:00:00" ...
## $ lon : num -115 -115 -115 -115 -115 ...
## $ lat : num 51.7 51.7 51.7 51.7 51.7 ...
## $ id : chr "GP1" "GP1" "GP1" "GP1" ...
R reads datetime data in a particular way. Converting your datetime column to “POSIXct” format will ensure that R reads that column as datetime information, not just character text.
To format a column to “POSIXct” format, you can use the as.POSIXct function.
Here, you need to be careful to specify the format of the datetime column exactly as it is, using POSIXct syntax (e.g., “%m” for month, “%d” for day, “%Y” for year, “%H” for hours, “%M” for minutes, and “%S” for seconds).
Let’s check the format of our datetime column:
elk_gps$datetime[1]
## [1] "3/25/2003 19:01:00"
Now we specify this format with POSIXct syntax in the “format”
argument of the as.POSIXct function.
elk_gps$datetime <- as.POSIXct(elk_gps$datetime, format="%m/%d/%Y %H:%M:%S")
str(elk_gps)
## 'data.frame': 138433 obs. of 4 variables:
## $ datetime: POSIXct, format: "2003-03-25 19:01:00" "2003-03-25 23:01:00" ...
## $ lon : num -115 -115 -115 -115 -115 ...
## $ lat : num 51.7 51.7 51.7 51.7 51.7 ...
## $ id : chr "GP1" "GP1" "GP1" "GP1" ...
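One caveat worth checking: as.POSIXct silently returns NA for any value that does not match the supplied format, so counting NAs right after conversion flags parsing failures. A small sketch:

```r
# Strings that do not match the format convert to NA rather than erroring
dt <- as.POSIXct(c("3/25/2003 19:01:00", "2003-03-25 19:01:00"),
                 format = "%m/%d/%Y %H:%M:%S")
sum(is.na(dt))  # 1 -- the second string is in a different format
```

If this count is larger than the number of genuinely missing values in the raw data, revisit your format string before moving on.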
Now that our datetime column is in POSIXct format, we can perform mathematical operations on this column and return the results in units of time.
elk_gps$datetime[2] - elk_gps$datetime[1]
## Time difference of 4 hours
For example, we can use the difftime function to take
the difference in time between our first and second observations in the
datetime column, specifying the desired units for the output:
difftime(elk_gps$datetime[2] , elk_gps$datetime[1], units = "mins")
## Time difference of 240 mins
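If you need the difference as a plain number (e.g., for later filtering on fix intervals), as.numeric strips the difftime class while keeping the requested units. A small sketch:

```r
# Convert a time difference into a bare number in the chosen units
t1 <- as.POSIXct("2003-03-25 19:01:00")
t2 <- as.POSIXct("2003-03-25 23:01:00")
as.numeric(difftime(t2, t1, units = "hours"))  # 4
as.numeric(difftime(t2, t1, units = "mins"))   # 240
```

Explicitly setting units in difftime is good practice, since the default units chosen by R depend on the size of the gap.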
R stores missing values as “NA” (or sometimes, “NaN”).
You can check for NA values in a vector or column using the
is.na function, which handily will return a vector the same
length, with TRUE where there are NA’s and FALSE where there are not
NA’s.
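A quick illustration of that behavior on a small vector:

```r
x <- c(-115.2, NA, -115.4)
is.na(x)         # FALSE  TRUE FALSE
which(is.na(x))  # 2 -- position of the missing value
sum(is.na(x))    # 1 -- each TRUE counts as 1, so this totals the NAs
```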
Let’s use the subset Base R function to select only the
rows in our elk data where there are no NA’s for datetime, lat,
and lon.
elk_gps2 <- subset(elk_gps, is.na(datetime)==FALSE & is.na(lon)==FALSE & is.na(lat)==FALSE)
str(elk_gps2)
## 'data.frame': 138429 obs. of 4 variables:
## $ datetime: POSIXct, format: "2003-03-25 19:01:00" "2003-03-25 23:01:00" ...
## $ lon : num -115 -115 -115 -115 -115 ...
## $ lat : num 51.7 51.7 51.7 51.7 51.7 ...
## $ id : chr "GP1" "GP1" "GP1" "GP1" ...
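An equivalent, more compact filter uses the complete.cases Base R function, which returns TRUE only for rows with no NA in any of the listed columns. A sketch on a toy data frame (the column names mirror ours):

```r
# complete.cases() flags rows that are fully observed across the given columns
toy <- data.frame(datetime = c(1, NA, 3),
                  lon      = c(-115.1, -115.2, NA),
                  lat      = c(51.7, 51.7, 51.7))
toy_clean <- toy[complete.cases(toy[, c("datetime", "lon", "lat")]), ]
nrow(toy_clean)  # 1 -- only the first row has no missing values
```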
Duplicated data can artificially inflate your dataset with repeated observations and result in conflicts with various functions used for analyses later.
We will use the “dplyr” R package for more data organizing and cleaning.
library(dplyr)
We will use the “|>” native R pipe to efficiently chain multiple functions on an object.
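The pipe simply passes the object on its left as the first argument of the function on its right, so these two calls are equivalent:

```r
# x |> f() is the same as f(x)
c(1, 4, 9) |> sqrt()  # 1 2 3
sqrt(c(1, 4, 9))      # identical result
```

(The native pipe requires R version 4.1 or later.)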
Before filtering out duplicate datetime data (using the
filter function to select the rows that are not duplicated,
using the “!” condition) and sorting our data by datetime information
(using the arrange function, which automatically sorts in
increasing order), we first need to group the data by each individual in
our dataset (our “id” column) so that the functions are applied to each
individual separately.
We then ungroup our data to combine the results among
all individuals and use the data.frame function to ensure
the resulting object is a dataframe structure.
elk_gps3 <- elk_gps2 |>
group_by(id) |>
filter(!duplicated(datetime)) |>
arrange(datetime) |>
ungroup() |>
data.frame()
head(elk_gps3)
## datetime lon lat id
## 1 2001-12-13 07:01:00 -115.8043 52.12410 4049
## 2 2001-12-13 09:01:00 -115.8003 52.11762 4049
## 3 2001-12-14 09:01:00 -115.8281 52.09611 4049
## 4 2001-12-14 11:00:00 -115.8318 52.09829 4049
## 5 2001-12-14 17:02:00 -115.8042 52.09482 4049
## 6 2001-12-14 19:01:00 -115.8037 52.12493 4049
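It can be reassuring to count duplicates per individual before (and after) filtering. A base R sketch with toy data, using tapply to apply a function over groups:

```r
# Count duplicated timestamps within each individual
toy <- data.frame(id       = c("A", "A", "A", "B"),
                  datetime = c(1, 1, 2, 1))
tapply(toy$datetime, toy$id, function(x) sum(duplicated(x)))
# individual "A" has one duplicated timestamp, "B" has none
```

After running the duplicate filter above, this count should be zero for every individual.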
Spatial data in R comes in multiple formats (vectors, e.g. points, lines, and polygons, or rasters). You may find the online resource Spatial Data Science with R and “terra” helpful for further information.
There are multiple packages in R for formatting spatial data but a fan favorite for vector data is the “sf” package.
Raster data can be processed using the “raster” package or more recently, the “terra” package.
library(sf)
If you are working with movement/tracking data, you should have columns with geographic coordinate information for each observation in the dataset; this information is often stored in latitude/longitude format (units: angular decimal degrees).
Let’s convert our elk data to “sf” format, using the st_as_sf function, specifying the columns with our coordinate information (longitude first, then latitude) and the Coordinate Reference System EPSG code (4326 corresponds to WGS 1984, a Geographic Coordinate System for data with coordinates in units of decimal degrees).
Our new sf object is a “POINT” vector type, with each location/observation in the data having a corresponding POINT geometry. The sf package has many amazing functions to manipulate spatial data, including spatial operations and conversion to different vector types.
elk_sf <- elk_gps3 |>
st_as_sf(coords = c("lon","lat"), crs=4326)
elk_sf
## Simple feature collection with 138421 features and 2 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -116.4028 ymin: 51.38134 xmax: -115.3461 ymax: 52.1541
## Geodetic CRS: WGS 84
## First 10 features:
## datetime id geometry
## 1 2001-12-13 07:01:00 4049 POINT (-115.8043 52.1241)
## 2 2001-12-13 09:01:00 4049 POINT (-115.8003 52.11762)
## 3 2001-12-14 09:01:00 4049 POINT (-115.8281 52.09611)
## 4 2001-12-14 11:00:00 4049 POINT (-115.8318 52.09829)
## 5 2001-12-14 17:02:00 4049 POINT (-115.8042 52.09482)
## 6 2001-12-14 19:01:00 4049 POINT (-115.8037 52.12493)
## 7 2001-12-14 21:00:00 4049 POINT (-115.7985 52.12587)
## 8 2001-12-15 01:01:00 4049 POINT (-115.7973 52.10518)
## 9 2001-12-15 03:01:00 4049 POINT (-115.7875 52.08844)
## 10 2001-12-15 05:01:00 4049 POINT (-115.8239 52.09922)
Visualizing data is an important step in the data processing and analysis stages.
Learning how to visualize your data correctly can help you catch errors in your code, outliers in your data, and interesting patterns that will inform your analysis choices.
Visualization can be done with Base R plotting or the “ggplot2” R package, which is excellent for creating complex plots in one line of code.
library(ggplot2)
We can use base R to specify a “fancy” plot showing each of our tracks in multiple dimensions, including longitude/latitude, latitude versus time, and longitude versus time.
Note that using a geographic coordinate system (GCS) with units in angular degrees (lat/lon) can be useful for mapping and observing patterns in the data, but direct, measurable comparisons can only be made with a projected coordinate system (PCS), where the units are directly measurable (e.g., meters). We will demonstrate how to work with projected coordinates in the next lab.
Note: You may find this online resource on Plotting
with Base R useful if you are unfamiliar with the plot
function and its various arguments.
Let’s make a base R plot of the track for the individual “GP1”.
GP1 <- subset(elk_gps3, id == "GP1")
par(mar = c(0,4,0,0), oma = c(4,0,5,2), xpd=NA)
layout(rbind(c(1,2), c(1,3)))
plot(GP1$lon, GP1$lat, asp = 1, type="o", ylab="Latitude", xlab="Longitude")
plot(GP1$datetime, GP1$lon, type="o", xaxt="n", ylab="Longitude", xlab="")
plot(GP1$datetime, GP1$lat, type="o", ylab="Latitude", xlab="Datetime")
title(paste("ID", GP1$id[1]), outer = TRUE)
We can also write our own little function, using the
function function, to apply this plot to all individuals at
once.
For more information/help on writing functions in R, check out this helpful online resource, Writing Functions in R.
plotTrack_latlon <- function(dataframe){
par(mar = c(0,4,0,0), oma = c(4,0,5,2), xpd=NA)
layout(rbind(c(1,2), c(1,3)))
plot(dataframe$lon, dataframe$lat, asp = 1, type="o", ylab="Latitude", xlab="Longitude")
plot(dataframe$datetime, dataframe$lon, type="o", xaxt="n", ylab="Longitude", xlab="")
plot(dataframe$datetime, dataframe$lat, type="o", ylab="Latitude", xlab="Datetime")
title(paste("ID", dataframe$id[1]), outer = TRUE)
}
We can use the “plyr” package with the d_ply function to
apply our new function grouped by a variable of interest, here each
individual. This function will simply return the output of the function
used (see also ddply, which we will use later to return a
dataframe based on a function’s output).
Note: plyr and dplyr do not “play nice” with each other. You
need to load the dplyr package AFTER the plyr package to avoid conflicts
and function masking. We can use the detach function to
unload our dplyr package, then use the library function to
load the plyr and dplyr packages, consecutively. If you see that a
particular function is “masked” you can use the package name and “:”
before the name of a particular function from a particular package to
ensure that masking (which happens when you have functions of the same
name in two packages that are loaded) does not occur (e.g.,
dplyr::select()).
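A tiny illustration of masking using only base packages: defining our own function called filter masks stats::filter, but the “::” prefix still reaches the original (the toy function here is just for demonstration):

```r
# A user-defined filter() masks stats::filter() in the search path
filter <- function(x) "my filter, not the stats one"
filter(1:3)                                # the masking version wins
exists("filter", where = "package:stats")  # TRUE -- the original is untouched
rm(filter)                                 # removing the mask restores normal lookup
```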
detach("package:dplyr", unload=TRUE)
library(plyr)
library(dplyr)
elk_gps3 |> d_ply("id", plotTrack_latlon)
The ggplot function takes arguments for the data object
to be plotted, the x and y axis variables to be plotted (within the
aes() function), a variety of additional aesthetic
arguments (e.g., color or linewidth), and additive functions that define
the plot type.
library(ggplot2)
For more help with plotting with ggplot, check out this helpful online resource, ggplot2 with the tidyverse.
Let’s make a plot for the elk individual “GP1”, using longitude for the x-axis, latitude for the y-axis, and using points and a connecting “path” between points to plot the track.
ggplot(data = GP1, aes(x = lon, y = lat)) +
geom_point()+
geom_path(size = 0.5, color = "darkgrey") +
theme_classic()
Now let’s visualize all individuals at once, using the additional facet_wrap function to facet or group our plots by each elk id. The “scale = ‘free’” argument allows each plot to have its own intuitive x and y axis limits (“scale = ‘fixed’” does the opposite).
ggplot(data = elk_gps3, aes(x = lon, y = lat)) +
geom_path(size = 0.5, color = "darkgrey") +
geom_point() +
theme_classic() +
facet_wrap(~id, scale="free", ncol=3)
We can also visualize latitude (or longitude) versus datetime.
This can be especially helpful for identifying interesting behavioral patterns in animal movement (e.g., residency vs transiting), which can inform your choice of analysis method later on.
ggplot(data = elk_gps3, aes(x = datetime, y = lat)) +
geom_path(size = 0.5) +
xlab("DateTime") + ylab("Latitude") +
theme_classic() +
facet_wrap(~id, scale="free", ncol = 3)
sf objects can be plotted by their attributes using the
plot Base R function.
For example, we can plot the geometry (here, points) for the “GP1” elk individual, after subsetting its data from the larger “elk_sf” object we created above.
GP1 <- subset(elk_sf, id == "GP1")
plot(GP1$geometry, pch = 19)
Without a background map, this plot is not very informative!
The “ggmap” package works helpfully with ggplot functions and sf data to add open-source basemaps. More info on the package can be found on the ggmap Github Repo Page.
library(ggmap)
Importantly, you need internet access to download the open-source base map “tiles”.
You also FIRST need to register for a free API key with
Stadia Maps at their API
Signup Page (see ?register_stadiamaps).
After you complete your registration, go to your client dashboard and create a new “property” (e.g., “Nicki’s API”). You can now create a new API key (save this somewhere on your computer so that you can find it if needed). You can also find instructions under the Stadia API Documentation Page.
key <- "e644fe03-1f7b-4b05-87a8-c65335eb4625"
register_stadiamaps(key, write = FALSE)
Now, before creating our map, we need to define the spatial extent for the basemap, as a “box” given by its left, bottom, right, and top coordinates.
GP1_bbox <- st_bbox(GP1)
names(GP1_bbox) <- c("left","bottom","right","top")
GP1_bbox
## left bottom right top
## -115.49872 51.66008 -115.46927 51.68975
We might want to add a buffer around this bbox, so that the basemap extends a bit beyond our points. Here we define a slightly larger box by hand (roughly 0.1 degrees beyond the points in each direction):
GP1_box <- c(left = -115.6, bottom = 51.5, right = -115.4, top = 51.75)
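One way to compute such a buffer programmatically is plain vector arithmetic on the named coordinates (the values below are copied from the printed bbox; treat this as a sketch):

```r
# Push each side of the box outward by a fixed margin (in degrees)
GP1_vals <- c(left = -115.49872, bottom = 51.66008,
              right = -115.46927, top = 51.68975)
margin <- 0.1
GP1_buffered <- GP1_vals + c(-margin, -margin, margin, margin)
GP1_buffered
```

The signs matter: left/bottom move down (more negative or smaller), right/top move up.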
Next, we use the get_map function on our new bbox to
grab the basemap, specifying source “stadia” to use the open source base
maps. Note that there are different maptypes available, which will
change the visual presentation of the basemap (see
?get_map, we are using “stamen_terrain”).
basemap <- get_map(GP1_box,
source = "stadia",
maptype = "stamen_terrain")
Annoyingly, defining an sf object removes the spatial columns from
the data (they are stored inside an st_geometry column
instead). We can extract each of the lat/long columns, using the
st_coordinates function on our sf object (stored as a
matrix, with long first, then lat).
GP1$lon <- st_coordinates(GP1)[,1]
GP1$lat <- st_coordinates(GP1)[,2]
ggmap(basemap, extent = "normal") +
geom_point(data = GP1, color="red") +
theme_classic()
We can use the “ggspatial” package to add open map tiles to the background of our map.
Note that this can be memory-intensive if you are plotting a lot of data at a high resolution …
library(ggspatial)
The “ggspatial” R package is great for maps using Open Street Map (OSM) base map tiles (open source).
The annotation_map_tile function allows you to specify a background map type (“type=”) and zoom level (“zoom=”, where a higher zoom is a higher resolution but may take longer to render). You can run the code rosm::osm.types() to see all the different map tile types available (we will use the “osm” one).
You can also add some nice map features, such as a scale bar (function annotation_scale()) and a north arrow (function annotation_north_arrow(), with arguments for specifying height, width, and padding dimensions). You can also specify nicer axis labels and breaks using the scale_y_continuous and scale_x_continuous functions, specifying the breaks you want.
You can then add on your other, regular ggplot functions, such as
geom_sf and theme options.
box <- st_bbox(c(xmin = -115.6, xmax = -115.4, ymin = 51.5, ymax = 51.75), crs = st_crs(4326))
ggplot() +
annotation_map_tile(type = 'osm', zoom = 12) +
annotation_scale()+
annotation_north_arrow(height=unit(0.5,"cm"), width=unit(0.5,"cm"), pad_y = unit(1,"cm"))+
shadow_spatial(box)+
ylab("Latitude") + xlab("Longitude")+
scale_y_continuous(breaks= c(51.5, 51.6, 51.7, 51.75))+
scale_x_continuous(breaks= c(-115.6, -115.5, -115.4))+
geom_sf(data=GP1,aes(), color="orange", size=2)+
theme_classic()
If you wanted to export your map, you could do so by sandwiching your map code between the file type function you want the image to be saved as (e.g., jpeg or png) and the dev.off function.
jpeg(file="./GP1_Map.jpg", units="in", width=4, height=7,res=300)
ggplot() +
annotation_map_tile(type = 'osm', zoom = 12) +
annotation_scale()+
annotation_north_arrow(height=unit(0.5,"cm"), width=unit(0.5,"cm"), pad_y = unit(1,"cm"))+
shadow_spatial(box)+
ylab("Latitude") + xlab("Longitude")+
scale_y_continuous(breaks= c(51.5, 51.6, 51.7, 51.75))+
scale_x_continuous(breaks= c(-115.6, -115.5, -115.4))+
geom_sf(data=GP1,aes(), color="orange", size=2)+
theme_classic()
dev.off()
Interactive mapping in R with the “mapview” R package is a useful way to visualize and engage with spatial data.
library(mapview)
Let’s create spatial tracks of all of our elk data, using our “elk_sf” object.
We use the group_by and summarize functions from the dplyr package, together with the st_cast function from the sf package, to first create individual elk tracks as LINESTRINGs.
elk_tracks <- elk_sf |>
group_by(id) |>
summarize(do_union=FALSE) |>
st_cast("LINESTRING")
Now we can use the mapview function to plot our tracks,
specifying to color the tracks by different ids with the “zcol”
argument. Note that the mapview function can plot any
spatial data and has a variety of additional controls/arguments
available (see Advanced
Mapview Controls for more options and examples).
mapview(elk_tracks, zcol="id")
After processing your data, it is always useful to save it as an intermediate data object. Rda format is perfect for this, as it is an R-specific file type that will load your saved objects directly back into your Global Environment for use.
elk_processed <- elk_gps3
save(elk_processed, file="./data/elk_processed.rda")